46 research outputs found

    Online Self-Healing Control Loop to Prevent and Mitigate Faults in Scientific Workflows

    Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. In spite of many success stories, a key challenge for running workflows in distributed systems is failure prediction, detection, and recovery. In this paper, we present a novel online self-healing framework, where failures are predicted before they happen, and are mitigated when possible. The proposed approach uses control theory developed as part of autonomic computing, and in particular applies the proportional-integral-derivative controller (PID controller) control loop mechanism, which is widely used in industrial control systems, to mitigate faults by adjusting the inputs of the mechanism. The PID controller aims at detecting the possibility of a fault far enough in advance so that an action can be performed to prevent it from happening. To demonstrate the feasibility of the approach, we tackle two common execution faults of the Big Data era: data footprint and memory usage. We define, implement, and evaluate PID controllers to autonomously manage data and memory usage of a bioinformatics workflow that consumes/produces over 4.4TB of data, and requires over 24TB of memory to run all tasks concurrently. Experimental results indicate that workflow executions may significantly benefit from PID controllers, in particular under online and unknown conditions. Simulation results show that nearly-optimal executions (slowdown of 1.01) can be attained when using our proposed control loop, and faults are detected and mitigated far in advance
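
To illustrate the control loop this abstract refers to, here is a minimal sketch of a textbook discrete PID controller. It is not the paper's implementation: the class name `PIDController`, the gains, and the setpoint are hypothetical, chosen only to show how the proportional, integral, and derivative terms combine into one corrective signal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDController:
    """Textbook discrete PID controller (illustrative; gains are assumptions)."""

    kp: float          # proportional gain
    ki: float          # integral gain
    kd: float          # derivative gain
    setpoint: float    # desired value of the monitored quantity
    _integral: float = 0.0
    _prev_error: Optional[float] = None

    def update(self, measurement: float, dt: float = 1.0) -> float:
        """Return the control output for one new measurement."""
        error = self.setpoint - measurement
        # Integral term accumulates past error over time.
        self._integral += error * dt
        # Derivative term is zero on the first sample (no previous error yet).
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


# Hypothetical usage: keep a monitored resource near a target of 10.0 units.
controller = PIDController(kp=1.0, ki=0.5, kd=0.1, setpoint=10.0)
first = controller.update(8.0)
second = controller.update(9.0)
```

In a resource-management setting, the sign and size of the output could be mapped to an action such as delaying or releasing tasks. That mapping is an assumption here; the abstract does not describe it.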

    GWAS and meta-analysis identifies 49 genetic variants underlying critical COVID-19

    Critical illness in COVID-19 is an extreme and clinically homogeneous disease phenotype that we have previously shown (1) to be highly efficient for discovery of genetic associations (2). Despite the advanced stage of illness at presentation, we have shown that host genetics in patients who are critically ill with COVID-19 can identify immunomodulatory therapies with strong beneficial effects in this group (3). Here we analyse 24,202 cases of COVID-19 with critical illness comprising a combination of microarray genotype and whole-genome sequencing data from cases of critical illness in the international GenOMICC (11,440 cases) study, combined with other studies recruiting hospitalized patients with a strong focus on severe and critical disease: ISARIC4C (676 cases) and the SCOURGE consortium (5,934 cases). To put these results in the context of existing work, we conduct a meta-analysis of the new GenOMICC genome-wide association study (GWAS) results with previously published data. We find 49 genome-wide significant associations, of which 16 have not been reported previously. To investigate the therapeutic implications of these findings, we infer the structural consequences of protein-coding variants, and combine our GWAS results with gene expression data using a monocyte transcriptome-wide association study (TWAS) model, as well as gene and protein expression using Mendelian randomization. We identify potentially druggable targets in multiple systems, including inflammatory signalling (JAK1), monocyte–macrophage activation and endothelial permeability (PDE4A), immunometabolism (SLC2A5 and AK5), and host factors required for viral entry and replication (TMPRSS2 and RAB2A)

    Using simple PID-inspired controllers for online resilient resource management of distributed scientific workflows

    Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. Although the scientific community has addressed this challenge from both theoretical and practical approaches, failure prediction, detection, and recovery still raise many research questions. In this paper, we propose an approach inspired by the control theory developed as part of autonomic computing to predict failures before they happen, and mitigate them when possible. The proposed approach is inspired by the proportional–integral–derivative controller (PID controller) control loop mechanism, which is widely used in industrial control systems, where the controller reacts by adjusting its output to mitigate faults. PID controllers aim to detect the possibility of a non-steady state far enough in advance so that an action can be performed to prevent it from happening. To demonstrate the feasibility of the approach, we tackle two common execution faults of large-scale data-intensive workflows: data storage overload and memory overflow. We developed a simulator, which implements and evaluates simple standalone PID-inspired controllers to autonomously manage data and memory usage of a data-intensive bioinformatics workflow that consumes/produces over 4.4 TB of data, and requires over 24 TB of memory to run all tasks concurrently. Experimental results obtained via simulation indicate that workflow executions may significantly benefit from the controller-inspired approach, in particular under online and unknown conditions. Simulation results show that nearly-optimal executions (slowdown of 1.01) can be attained when using our proposed method, and faults are detected and mitigated far in advance of their occurrence

    Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Workflows

    Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. In spite of many success stories, a key challenge for running workflows in distributed systems is failure prediction, detection, and recovery. In this paper, we propose an approach to use control theory developed as part of autonomic computing to predict failures before they happen, and mitigate them when possible. The proposed approach applies the proportional-integral-derivative controller (PID controller) control loop mechanism, which is widely used in industrial control systems, to mitigate faults by adjusting the inputs of the controller. The PID controller aims at detecting the possibility of a fault far enough in advance so that an action can be performed to prevent it from happening. To demonstrate the feasibility of the approach, we tackle two common execution faults of the Big Data era: data storage overload and memory overflow. We define, implement, and evaluate simple PID controllers to autonomously manage data and memory usage of a bioinformatics workflow that consumes/produces over 4.4TB of data, and requires over 24TB of memory to run all tasks concurrently. Experimental results indicate that workflow executions may significantly benefit from PID controllers, in particular under online and unknown conditions. Simulation results show that nearly-optimal executions (slowdown of 1.01) can be attained when using our proposed method, and faults are detected and mitigated far in advance of their occurrence

    Functional Transcription Factor Target Networks Illuminate Control of Epithelial Remodelling

    Cell identity is governed by gene expression, regulated by transcription factor (TF) binding at cis-regulatory modules. Decoding the relationship between TF binding patterns and gene regulation is nontrivial, remaining a fundamental limitation in understanding cell decision-making. We developed the NetNC software to predict functionally active regulation of TF targets; demonstrated on nine datasets for the TFs Snail, Twist, and modENCODE Highly Occupied Target (HOT) regions. Snail and Twist are canonical drivers of epithelial to mesenchymal transition (EMT), a cell programme important in development, tumour progression and fibrosis. Predicted “neutral” (non-functional) TF binding always accounted for the majority (50% to 95%) of candidate target genes from statistically significant peaks and HOT regions had higher functional binding than most of the Snail and Twist datasets examined. Our results illuminated conserved gene networks that control epithelial plasticity in development and disease. We identified new gene functions and network modules including crosstalk with notch signalling and regulation of chromatin organisation, evidencing networks that reshape Waddington’s epigenetic landscape during epithelial remodelling. Expression of orthologous functional TF targets discriminated breast cancer molecular subtypes and predicted novel tumour biology, with implications for precision medicine. Predicted invasion roles were validated using a tractable cell model, supporting our approach

    A custom capture sequence approach for oculocutaneous albinism identifies structural variant alleles at the OCA2 locus

    Oculocutaneous albinism (OCA) is a heritable disorder of pigment production that manifests as hypopigmentation and altered eye development. Exon sequencing of known OCA genes is unsuccessful in producing a complete molecular diagnosis for a significant number of affected individuals. We sequenced the DNA of individuals with OCA using short-read custom capture sequencing that targeted coding, intronic and non-coding regulatory regions of known OCA genes and GWAS-associated pigmentation loci. We identified an OCA2 complex structural variant (CxSV), defined by a 143kb inverted segment reintroduced in intron 1, upstream of the native location. The corresponding CxSV junctions were observed in 11/390 probands screened. The 143kb CxSV presents in one family as a copy number variant (CNV) duplication for the 143kb region. In the remaining 10/11 families, the 143kb CxSV acquired an additional 184kb deletion across the same region, restoring exons 3–19 of OCA2 to a copy-number neutral state. Allele-associated haplotype analysis found rare SNVs rs374519281 and rs139696407 are linked with the 143kb CxSV in both OCA2 alleles. For individuals in which customary molecular evaluation does not reveal a biallelic OCA diagnosis, we recommend preliminary screening for these haplotype-associated rare variants, followed by junction-specific validation for the OCA2 143kb CxSV

    A common TMPRSS2 variant has a protective effect against severe COVID-19

    Background: The human protein transmembrane protease serine type 2 (TMPRSS2) plays a key role in SARS-CoV-2 infection, as it is required to activate the virus’ spike protein, facilitating entry into target cells. We hypothesized that naturally-occurring TMPRSS2 human genetic variants affecting the structure and function of the TMPRSS2 protein may modulate the severity of SARS-CoV-2 infection. Methods: We focused on the only common TMPRSS2 non-synonymous variant predicted to be damaging (rs12329760 C>T, p.V160M), which has a minor allele frequency ranging from 0.14 in Ashkenazi Jewish to 0.38 in East Asians. We analysed the association between the rs12329760 and COVID-19 severity in 2,244 critically ill patients with COVID-19 from 208 UK intensive care units recruited as part of the GenOMICC (Genetics Of Mortality In Critical Care) study. Logistic regression analyses were adjusted for sex, age and deprivation index. For in vitro studies, HEK293 cells were co-transfected with ACE2 and either TMPRSS2 wild type or mutant (TMPRSS2V160M). A SARS-CoV-2 pseudovirus entry assay was used to investigate the ability of TMPRSS2V160M to promote viral entry. Results: We show that the T allele of rs12329760 is associated with a reduced likelihood of developing severe COVID-19 (OR 0.87, 95% CI: 0.79-0.97, p=0.01). This association was stronger in homozygous individuals when compared to the general population (OR 0.65, 95% CI: 0.50-0.84, p=1.3 × 10⁻³). We demonstrate in vitro that this variant, which causes the amino acid substitution valine to methionine, affects the catalytic activity of TMPRSS2 and is less able to support SARS-CoV-2 spike-mediated entry into cells. Conclusion: TMPRSS2 rs12329760 is a common variant associated with a significantly decreased risk of severe COVID-19. Further studies are needed to assess the expression of TMPRSS2 across different age groups. Moreover, our results identify TMPRSS2 as a promising drug target, with a potential role for camostat mesilate, a drug approved for the treatment of chronic pancreatitis and postoperative reflux esophagitis, in the treatment of COVID-19. Clinical trials are needed to confirm this

    Whole-genome sequencing reveals host factors underlying critical COVID-19

    Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care (1) or hospitalization (2-4) after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1), in critical disease. Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.